Robust text-independent speaker identification over telephone channels
نویسندگان
چکیده
This paper addresses the issue of closed-set text-independent speaker identification from samples of speech recorded over the telephone. It focuses on the effects of acoustic mismatches between training and testing data, and concentrates on two approaches: 1) extracting features that are robust against channel variations and 2) transforming the speaker models to compensate for channel effects. First, an experimental study shows that optimizing the front end processing of the speech signal can significantly improve speaker recognition performance. A new filterbank design is introduced to improve the robustness of the speech spectrum computation in the front-end unit. Next, a new feature based on spectral slopes is described. Its ability to discriminate between speakers is shown to be superior to that of the traditional cepstrum. This feature can be used alone or combined with the cepstrum. The second part of the paper presents two model transformation methods that further reduce channel effects. These methods make use of a locally collected stereo database to estimate a speaker-independent variance transformation for each speech feature used by the classifier. The transformations constructed on this stereo database can then be applied to speaker models derived from other databases. Combined, the methods developed in this paper resulted in a 38% relative improvement on the closed-set 30-s training 5-s testing condition of the NIST’95 Evaluation task, after cepstral mean removal.
منابع مشابه
A na ve de-lambing method for speaker identification
This paper addresses the issue of close-set text-independent speaker identification from speech samples recorded over telephone. We have known that the speaker identification performance variability can be attributed to many factors. One major factor is the inherent differences in the recognizability of different speakers. In speaker recognition systems such differences are characterized by the...
متن کاملA Naïve De-lambing Method for Speaker Identification
This paper addresses the issue of close-set text-independent speaker identification from speech samples recorded over telephone. We have known that the speaker identification performance variability can be attributed to many factors. One major factor is the inherent differences in the recognizability of different speakers. In speaker recognition systems such differences are characterized by the...
متن کاملRobust text-independent speaker identification using Gaussian mixture speaker models
This paper introduces and motivates the use of Gaussian mixture models (CMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are efTective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance ...
متن کاملProbabilistic Neural Networks Combined with Gmms for Speaker Recognition over Telephone Channels
In this paper we study the applicability of Probabilistic Neural Networks (PNNs) as core classifiers to medium scale speaker recognition over fixed telephone networks. In particular, banking applications with up to 400 enrolled speakers and short training times are targeted. Two PNN-based open-set text-independent systems for Speaker Identification and Speaker Verification correspondingly are p...
متن کاملText-independent Forensic Speaker Identification Using Telephone Speech
On several occasions we have been asked by police officials whether it is feasible for investigative purposes to automatically identify a speaker using speech from intercepted telephone conversations. Since forensic phoneticians are typically not involved in speaker identification but in speaker verification we tend to reformulate this question in : Is it feasible for investigative purposes to ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- IEEE Trans. Speech and Audio Processing
دوره 7 شماره
صفحات -
تاریخ انتشار 1999